-
Free, publicly-accessible full text available July 1, 2026
-
Predicting the evolutionary patterns of emerging and endemic viruses is key to mitigating their spread. In particular, it is critical to rapidly identify mutations with the potential for immune escape or increased disease burden. Knowing which circulating mutations pose a concern can inform treatment or mitigation strategies such as alternative vaccines or targeted social distancing. In 2021, Hie et al. (Hie B, Zhong ED, Berger B, Bryson B. 2021 Learning the language of viral evolution and escape. Science 371, 284–288; doi:10.1126/science.abd7331) proposed that variants of concern can be identified using two quantities extracted from protein language models: grammaticality and semantic change. These quantities are defined by analogy to concepts from natural language processing. Grammaticality is intended to measure whether a variant viral protein is viable, and semantic change is intended to measure its potential for immune escape. Here, we systematically test this hypothesis, taking advantage of several high-throughput datasets that have since become available and comparing this model with several more recently published machine learning models. We find that grammaticality can serve as a measure of protein viability, though methods trained explicitly to predict mutational effects appear to be more effective. By contrast, we do not find compelling evidence that semantic change is a useful tool for identifying immune escape mutations.
Free, publicly-accessible full text available April 1, 2026
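The two quantities named in this abstract can be illustrated with a minimal sketch. It assumes the definitions commonly attributed to Hie et al. (2021): grammaticality as the language-model probability of the mutant residue at the mutated position, and semantic change as the L1 distance between wild-type and mutant sequence embeddings, with the two combined by summing ranks. The function names, the `beta` weight, and the array inputs are hypothetical; no real protein language model is called here.

```python
import numpy as np

def grammaticality(probs, position, mutant_index):
    """Model probability of the mutant residue at the mutated position.

    `probs` is a hypothetical (sequence_length, alphabet_size) array of
    per-position residue probabilities produced by some language model.
    """
    return float(probs[position, mutant_index])

def semantic_change(z_wt, z_mut):
    """L1 distance between wild-type and mutant sequence embeddings."""
    return float(np.abs(np.asarray(z_wt) - np.asarray(z_mut)).sum())

def rank_ascending(values):
    """Return ranks starting at 1, where rank 1 is the smallest value."""
    values = np.asarray(values)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=int)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks

def combined_score(sem_changes, grammaticalities, beta=1.0):
    """Rank-sum score: higher means larger semantic change and grammaticality.

    ASSUMPTION: a simple rank sum weighted by `beta`; the weighting used in
    any particular study may differ.
    """
    return rank_ascending(sem_changes) + beta * rank_ascending(grammaticalities)
```

Under this sketch, a mutation scores highly only when the model considers it both plausible (high grammaticality) and far from the wild type in embedding space (high semantic change), which is the hypothesis the abstract sets out to test.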
-
Abstract We explore sequence determinants of enzyme activity and specificity in a major enzyme family of terpene synthases. Most enzymes in this family catalyze reactions that produce cyclic terpenes, complex hydrocarbons widely used by plants and insects in diverse biological processes such as defense, communication, and symbiosis. To analyze the molecular mechanisms by which terpene cyclization emerges, we have carried out an in-depth examination of the mutational space around (E)-β-farnesene synthase, an Artemisia annua enzyme that catalyzes production of a linear hydrocarbon chain. Each mutant enzyme in our synthetic libraries was characterized biochemically, and the resulting reaction rate data were used as input to the Michaelis–Menten model of enzyme kinetics, in which free energies were represented as sums of one-amino-acid contributions and two-amino-acid couplings. Our model predicts measured reaction rates with high accuracy and yields free energy landscapes characterized by relatively few coupling terms. As a result, the Michaelis–Menten free energy landscapes have a simple, interpretable structure and exhibit little epistasis. We have also developed biophysical fitness models based on the assumption that highly fit enzymes have evolved to maximize the output of correct products, such as cyclic products or a specific product of interest, while minimizing the output of byproducts. This approach results in nonlinear fitness landscapes that are considerably more epistatic. Overall, our experimental and computational framework provides a focused characterization of the evolutionary emergence of novel enzymatic functions in the context of microevolutionary exploration of sequence space around naturally occurring enzymes.
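The free energy decomposition described in this abstract can be sketched as follows. The one-site table `h`, the coupling table `J`, and all function names are hypothetical, and the Boltzmann-type mapping from free energy to a catalytic rate constant is an assumption made purely for illustration; the abstract does not specify how free energies enter the kinetic parameters. Free energies are taken in units of kT.

```python
import math

def free_energy(seq, h, J):
    """Sum of one-amino-acid terms plus two-amino-acid couplings.

    h[i] maps a residue at position i to its one-site contribution.
    J[(i, j)] maps a residue pair at positions (i, j) to a coupling term.
    Entries missing from either table contribute zero.
    """
    total = sum(h[i].get(aa, 0.0) for i, aa in enumerate(seq))
    for (i, j), table in J.items():
        total += table.get((seq[i], seq[j]), 0.0)
    return total

def kcat_from_free_energy(dG, k0=1.0):
    """ASSUMPTION: Boltzmann-type mapping, kcat = k0 * exp(-dG), dG in kT."""
    return k0 * math.exp(-dG)

def michaelis_menten_rate(substrate, kcat, km, enzyme=1.0):
    """Standard Michaelis-Menten rate: v = kcat * [E] * [S] / (K_M + [S])."""
    return kcat * enzyme * substrate / (km + substrate)
```

In this sketch, a landscape with an empty `J` is purely additive in free energy, so any epistasis in free energy comes only from the coupling terms. This matches the abstract's observation that few couplings imply little epistasis at the free energy level, while the nonlinear mapping to rates or fitness can still introduce it.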
-
Abstract A key question in evolutionary biology concerns the relative importance of different sources of adaptive genetic variation, such as de novo mutations, standing variation, and introgressive hybridization. A corollary question concerns how allelic variants derived from these different sources may influence the molecular basis of phenotypic adaptation. Here, we use a protein-engineering approach to examine the phenotypic effect of putatively adaptive hemoglobin (Hb) mutations in the high-altitude Tibetan wolf that were selectively introgressed into the Tibetan mastiff, a high-altitude dog breed that is renowned for its hypoxia tolerance. Experiments revealed that the introgressed coding variants confer an increased Hb–O2 affinity in conjunction with an enhanced Bohr effect. We also document that affinity-enhancing mutations in the β-globin gene of Tibetan wolf were originally derived via interparalog gene conversion from a tandemly linked β-globin pseudogene. Thus, affinity-enhancing mutations were introduced into the β-globin gene of Tibetan wolf via one form of intragenomic lateral transfer (ectopic gene conversion) and were subsequently introduced into the Tibetan mastiff genome via a second form of lateral transfer (introgression). Site-directed mutagenesis experiments revealed that the increased Hb–O2 affinity requires a specific two-site combination of amino acid replacements, suggesting that the molecular underpinnings of Hb adaptation in Tibetan mastiff (involving mutations that arose in a nonexpressed gene and which originally fixed in Tibetan wolf) may be qualitatively distinct from functionally similar changes in protein function that could have evolved via sequential fixation of de novo mutations during the breed’s relatively short duration of residency at high altitude.
